Llama-3.3-Nemotron-Super-49B-v1.5 is an efficient large language model developed by NVIDIA and derived from Meta Llama-3.3-70B-Instruct. It performs well on reasoning, chat, and agentic tasks. Neural Architecture Search (NAS) significantly reduces its memory footprint relative to the base model, and it supports a context length of 128K tokens. Its capabilities in mathematics, code, science, and tool calling have been enhanced.
Natural Language Processing
Transformers
English
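
To give a rough sense of the memory reduction mentioned in the description, the sketch below estimates the memory needed for the weights alone. It assumes the parameter counts implied by the model names (49B and 70B) and 16-bit weights (2 bytes per parameter); the helper `weight_memory_gib` is hypothetical, and the estimate ignores the KV cache, activations, and runtime overhead, so real usage will be higher.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for model weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


# Assumption: parameter counts are taken from the model names, not measured.
NEMOTRON_PARAMS = 49e9   # Llama-3.3-Nemotron-Super-49B-v1.5
LLAMA_70B_PARAMS = 70e9  # Meta Llama-3.3-70B-Instruct

nemotron_gib = weight_memory_gib(NEMOTRON_PARAMS)
llama_gib = weight_memory_gib(LLAMA_70B_PARAMS)

# Fraction of weight memory saved relative to the 70B base model.
savings = 1 - nemotron_gib / llama_gib

print(f"49B weights at 16-bit: ~{nemotron_gib:.0f} GiB")
print(f"70B weights at 16-bit: ~{llama_gib:.0f} GiB")
print(f"Weight memory saved:   ~{savings:.0%}")
```

Under these assumptions the saving in weight memory follows directly from the parameter ratio; quantized weights would shrink both figures but leave the ratio unchanged.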